Contour Regularity Extraction Based on String Edit Distance

نویسندگان

  • José Ignacio Abreu Salas
  • Juan Ramón Rico-Juan
چکیده

In this paper, we present a new method for constructing prototypes representing a set of contours encoded by Freeman Chain Codes. Our method build new prototypes taking into account similar segments shared between contours instances. The similarity criterion was based on the Levenshtein Edit Distance definition. We also outline how to apply our method to reduce a data set without sensibly affect its representational power for classification purposes. Experimental results shows that our scheme can achieve compressions about 50% while classification error increases only by 0.75%.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

String Edit Analysis for Merging Databases

The first step prior to data mining is often to merge databases from different sources. Entries in these databases or descriptions retrieved using information extraction. may use significantly different vocabularies, so one often needs to determine whether similar descriptions refer to the same item or to different items (e.g., people or goods). String edit distance is an elegant way of definin...

متن کامل

PartSS: An Efficient Partition-based Filtering for Edit Distance Constraints

This paper introduces PartSS, a new partition-based filtering for tasks performing string comparisons under edit distance constraints. PartSS offers improvements over the state-of-the-art method NGPP with the implementation of a new partitioning scheme and also improves filtering abilities by exploiting theoretical results on shifting and scaling ranges, thus accelerating the rate of calculatin...

متن کامل

Un Sistema de Extracción de Información Basado en Ontologías para Documentos en el Dominio de las Tecnologías de Información An Ontology-Based Information Extractor for Data-Rich Documents in the Information Technology Domain

This paper presents an information extraction method, suitable for data-rich documents, based on the knowledge represented in a domain ontology. The extractor combines a fuzzy string matcher and a word sense disambiguation (WSD) algorithm. The fuzzy string matcher finds mentions of terms combining character-level and token-level similarity measures dealing with non-standardized acronyms and inc...

متن کامل

Distance Based Indexing for String Proximity Search

In many database applications involving string data, it is common to have near neighbor queries (asking for strings that are similar to a query string) or nearest neighbor queries (asking for strings that are most similar to a query string). The similarity between strings is defined in terms of a distance function determined by the application domain. The most popular string distance measures a...

متن کامل

A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance

The need to measure sequence similarity arises in information extraction, object identity, data mining, biological sequence analysis, and other domains. This paper presents discriminative string-edit CRFs, a finitestate conditional random field model for edit sequences between strings. Conditional random fields have advantages over generative approaches to this problem, such as pair HMMs or the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009